RODHA: Robust Outlier Detection using Hybrid Approach
Authors
Abstract
The task of outlier detection is to find the small groups of data objects that deviate from the inherent behavior of the rest of the data. Detecting such outliers is fundamental to a variety of database and analytic tasks, such as fraud detection and customer migration. Several approaches [10] to outlier detection are employed across many study areas, among which distance-based and density-based techniques have attracted the most attention from researchers. In information theory, entropy is a core concept that measures the uncertainty of a stochastic event; in other words, entropy describes the distribution of an event. Because of its ability to describe the distribution of data, entropy has been applied to clustering in data mining. In this paper, we develop a robust supervised outlier detection algorithm using a hybrid approach (RODHA), which combines the concepts of distance and density with an entropy measure when determining an outlier. We provide an empirical study of existing outlier detection algorithms and establish the effectiveness of the proposed RODHA in comparison with them.
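The three ingredients named in the abstract (distance, density, and entropy) can be illustrated with a minimal Python sketch. This is not the RODHA algorithm, whose details the abstract does not give. The function names, the choice of `k`, and the simplified density ratio (a rough stand-in for density-based scores such as LOF) are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def k_nearest(points, i, k):
    """Return sorted (distance, index) pairs for the k nearest neighbours of points[i]."""
    pairs = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return pairs[:k]


def distance_scores(points, k):
    """Distance-based score: distance to the k-th nearest neighbour."""
    return [k_nearest(points, i, k)[-1][0] for i in range(len(points))]


def density_ratio_scores(points, k):
    """Simplified density-based score (illustrative, not exact LOF).

    Ratio of a point's mean k-NN distance to the mean k-NN distance of its
    neighbours. Values well above 1 suggest the point lies in a sparser
    region than its neighbours. Assumes no duplicate points (a zero mean
    distance would cause a division by zero).
    """
    neigh = [k_nearest(points, i, k) for i in range(len(points))]
    avg = [sum(d for d, _ in nb) / k for nb in neigh]
    return [
        avg[i] / (sum(avg[j] for _, j in neigh[i]) / k)
        for i in range(len(points))
    ]


if __name__ == "__main__":
    # Four points on a unit square plus one far-away point (hypothetical data).
    points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(distance_scores(points, k=2))
    print(density_ratio_scores(points, k=2))
    print(shannon_entropy(["a", "a", "b", "b"]))
```

On the sample data, the far-away point receives the largest score under both the distance-based and the density-based measure, while the four clustered points score close to each other. How a hybrid method would weight or combine such scores with entropy is a design choice the abstract leaves open.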
Similar resources
Simultaneous robust estimation of multi-response surfaces in the presence of outliers
A robust approach should be considered when estimating regression coefficients in multi-response problems. Many models are derived from the least squares method. Because the presence of outlier data is unavoidable in most real cases and because the least squares method is sensitive to these types of points, robust regression approaches appear to be a more reliable and suitable method for addres...
Outlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis
Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...
Outlier Detection for Support Vector Machine using Minimum Covariance Determinant Estimator
The purpose of this paper is to identify the points that affect the performance of one of the important data mining algorithms, namely the support vector machine. The final classification decision is made on the basis of a small portion of the data called support vectors. The existence of atypical observations among these points will therefore result in deviation from the correct decision. Thus...
Analysis of a Problem Using Various Visions
In this paper, an applied problem in which the response of interest is the number of successes in a specific experiment is considered and studied from various viewpoints. The effects of outlying response values on the results of a regression analysis are important enough to warrant study. For this reason, outlying response values are identified using diagnostic methods. It is shown that use of arc-sine ...
Hybrid Approach for Outlier Detection in High Dimensional Data
It has been observed recently that the prominence of multidimensional data is increasing. Existing outlier detection techniques generally fail to work on multidimensional data. The need to analyze high-dimensional data has thus increased with today’s data trends. It has extensive applications in the medical domain, network intrusion detection and satellite imagery. Even though there are existing methodologie...